Proof Systems: Chaining, Treelike Proofs, and Resolution

Database/feature-matrix view of a knowledge base (DF)

If the knowledge base is a set of formulas (hypotheses) $KB=\lbrace h_1,\dots,h_k\rbrace $, we can also view it as the set of worlds consistent with all of them:

[ DF_{KB} := \lbrace \bar x\in X : \forall i\in[k],\ h_i(\bar x)=1\rbrace ] Equivalently, $DF_{KB}$ is the “database” of all assignments that satisfy the conjunction $\bigwedge_{i=1}^k h_i$.

Tiny example (two Boolean attributes)

Let the attributes be dog and furry animal. The rule

\[ \text{dog}\Rightarrow\text{furry animal} \]

eliminates exactly the world where dog=1 and furry animal=0. So in a 2×2 grid (columns: dog=0/1, rows: furry=0/1), the only forbidden cell is (dog=1, furry=0); the other 3 cells are in $DF$.

A set of possible world $X$ corresponding to the setting of $n$ Boolean attributes, e.g., “dog”, “furry animal”. Actual possibilities $D\subseteq X$, the set $\lbrace \bar x\in X, \text{"dog"}\implies\text{"furry animal"}\rbrace $ contains all of $D$, we say it is valid for $D$, $D\models\text{"dog"}\implies\text{"furry animal"}$

Declarative Knowledge

Formulas (or other representations) that are valid for $D$.

Knowledge Base

We’ll be given a set of representations $\lbrace h_1,\dots,h_k\rbrace $ called a knowledge base.

Implicitly, the knowledge base (KB) corresponds to that formula $\bigwedge_{i=1}^{k}h_i$.

Implicitly, if the $h_1,\dots,h_k$ are my knowledge base, we have $D\models\bigwedge_{i=1}^{k}h_i$

Reasoning Task

Given classes of representations, $Q$ (queries) and $H$ (hypotheses). The reasoning task is given a $KB\subseteq H$, query $q\in Q$, and it should output “accept” or “reject”.

If an algorithm only outputs “accept” when $KB\models q$, we say it is sound.

If an algorithm outputs “accept” whenever $KB\models q$, we say it is complete.

Observe

KB does not entail $q$ if and only if there is $\bar x^*\in X$ such that

$q(\bar x^*)=0$
$\forall h_i\in KB, h_i(\bar x^*)=1$

So, deciding whether $KB\models q$ is equivalent to whether $\lnot q(x)\land \left(\bigwedge^k_{i=1}h_i(x)\right)$ is satisfiable

So, if $H$ contains all clauses of 3 literals (ORs of less than or equal to 3 variables or negations of variables) and $Q$ has anything non-trivial (e.g., a tautology literal). 3 CNF satisfiability (3 SAT) is a special case of reasoning for $Q+H$ where 3 SAT is NP-complete (hence, intractable).

Thus, the complete requirement of the reasoning task can be weakened to provability in a proof system

Proof System

A proof system is given by a class of representations $J$ and a relations $R\in J\times J^*$ ($J^*$ is the finite length of $\phi\in J$). A proof of $\phi\in J$ from premises $h_1,\dots,h_k\in J$ is a sequence $\pi=\pi_1,\dots,\pi_l$ where $\pi_i\in J$ and for each $i\in[1,l]$, there is a subsequence $\pi'_1,\dots,\pi'_r$ from $\pi_1,\dots,\pi_{i-1}$ such that $(\pi'_i,\dots,\pi'_r)\in R$ or $\pi_i \in\lbrace h_1,\dots,h_k\rbrace $ with $\pi_l=\phi$.

If $(\pi_i)\in R$ (i.e., with $r=0$ prior lines) we say it is an axiom.

Otherwise, it follows from rule of inference.

We say that the proof system is sound if whenever $(\pi_1, \pi'_1,\dots,\pi'_r)$ then $\lbrace \pi'_1,\dots,\pi'_r\rbrace \models\pi_1$.

In particular, axioms must be tautologies (true for all $\bar x\in X$)

If the proof system is sound, and there is a proof of $\phi$ from $KB=\lbrace h_1,\dots,h_k\rbrace $, then $KB\models \phi$.

Chaining (Example of Proof System)

$J$ contains formulas of the form, $[\bigwedge^r_{i=1}l_i]\implies l^*$ for literals $l_1,\dots,l_r,l^*$ “rules” (when $r\geq 1$) and $l^*$ “facts” ($r=0$).

$R$ contains $\left(l^*, [\bigwedge^r_{i=1}l_i]\implies l^*, l_1, \dots, l_r\right)$. i.e., can conclude $l^*$ from rules + body literals.

Example: Chaining proof of `mischief`

Knowledge base (rules + facts):

$[\text{dog}\land \text{young}] \Rightarrow \text{puppy}$
$\text{puppy} \Rightarrow \text{mischief}$
$\text{dog}$
$\text{young}$

A chaining proof of mischief is:

From $\text{dog}$ and $\text{young}$ and $[\text{dog}\land \text{young}]\Rightarrow \text{puppy}$, infer $\text{puppy}$.
From $\text{puppy}$ and $\text{puppy}\Rightarrow \text{mischief}$, infer $\text{mischief}$.

So there is a proof of mischief from $\lbrace [\text{dog}\land \text{young}]\Rightarrow \text{puppy},\ \text{puppy}\Rightarrow \text{mischief},\ \text{dog},\ \text{young}\rbrace $.

Limitation: chaining cannot derive “new negations”

Consider the KB:

$[\text{dog}\land \text{young}] \Rightarrow \text{puppy}$
$\text{dog}$
$\lnot\text{puppy}$

Semantically, this KB entails $\lnot\text{young}$ (because if dog is true and puppy is false, then the antecedent dog ∧ young must be false, hence young must be false).

However, chaining cannot prove $\lnot\text{young}$ here: chaining only concludes - given facts, or - literals that appear in the heads (consequents) of rules.

Treelike proofs

It’s often helpful to draw a proof as a dependency graph: edges point from a derived line to the earlier lines used to justify it.

If the proof graph is a tree (equivalently: every earlier line is used at most once to justify a later step), we say the proof is treelike.

Resolution (Example of Proof System)

Let $J$ be the set of clauses, i.e., disjunctions (ORs) of literals.

The resolution proof system has the inference rule (sometimes called cut):

\[ \frac{(x_i \lor C)\quad (\lnot x_i \lor D)}{C \lor D} \]

That is: given a pair of clauses containing a complementary pair $x_i$ and $\lnot x_i$, we may conclude the OR of the “rest of the literals”, dropping $x_i$ and $\lnot x_i$.

Why the rule is sound

Claim: [ \lbrace x_i \lor C,\ \lnot x_i \lor D\rbrace \models C \lor D. ]

Proof (by cases on $x_i$):

If $x_i=0$, then $x_i\lor C$ being true implies $C$ is true, hence $C\lor D$ is true.
If $x_i=1$, then $\lnot x_i\lor D$ being true implies $D$ is true, hence $C\lor D$ is true.

So the inference preserves logical consequence.

Using resolution for proof by contradiction of a DNF query

Typically, we use resolution to refute the negation of a query.

Let the query be a DNF: [ q = T_1 \lor T_2 \lor \cdots \lor T_k, ] where each $T_i$ is a conjunction (AND) of literals.

Then [ \lnot q = \lnot(T_1\lor\cdots\lor T_k) = (\lnot T_1)\land(\lnot T_2)\land\cdots\land(\lnot T_k). ]

Each $\lnot T_i$ is equivalent to a clause $C_i$ (an OR of literals), so we take the set of clauses [ \lbrace C_1,\dots,C_k\rbrace ] as the hypotheses and attempt to derive the empty clause $\bot$.

The empty clause $\bot$ is false in every possible world.
Deriving $\bot$ means the hypotheses are unsatisfiable, i.e., $\lnot q$ is unsatisfiable, hence $q$ is valid (entailed) under the hypotheses.

Soundness intuition for refutations

If a resolution derivation reaches $\bot$, then somewhere (explicitly or implicitly) we have derived both $x_i$ and $\lnot x_i$ for some variable.

If the starting clauses were all satisfiable by some assignment $\bar x^*$, sound inference would preserve satisfiability along the derivation; but $\bot$ is unsatisfiable. Therefore, reaching $\bot$ implies there is no satisfying assignment to the starting clauses.